Zoonotic transmission of novel viruses represents a significant 
threat to global public health and is fueled by globalization, the 
loss of natural habitats, and exposure to new hosts. For 
coronaviruses (CoVs), broad diversity exists within bat 
populations and uniquely positions them to seed future 
emergence events. In this review, we explore the host and viral 
dynamics that shape these CoV populations for survival, 
amplification, and possible emergence in novel hosts. 
Introduction 

In the past decade, molecular techniques have expanded 
identification of zoonotic viruses, including coronaviruses 
(CoVs) [1]. Traditionally, approaches for viral identification 
have included culturing, antigen staining, electron 
microscopy, and serology [2]; however, these techniques 
were inherently biased towards known viral families and 
were largely insensitive to uncharacterized species. In 
contrast, molecular diagnostics rapidly identified 
unknown pathogens starting with Sin Nombre virus in 
the late 20th century, continuing with SARS-CoV in the 
early part of this century, and most recently with 
MERS-CoV [3-5]. As molecular approaches improved, these 
techniques became standard in identifying infectious 
agents in both acute and chronic disease settings. 
Coupled with reduced cost, these new approaches have 
permitted application for pathogen discovery; the number 
of known CoVs has increased substantially, aided by both 
surveys of animal populations and infrastructure investments 
to improve diagnostic capacity in disease hotspots 
[6]. Importantly, the resulting inventory illustrates the 
broad diversity harbored in zoonotic hosts and the presence 
of quasi-species that may serve as a reservoir for CoV 
persistence. In this review, we examine how both bat 
hosts and the CoVs that they harbor may be uniquely 
positioned to seed future emergence events, especially as 
human populations increase and penetrate the undeveloped 
regions of the world. 

Bat reservoirs: shaping virus emergence 

While numerous animals have been surveyed in the past 
decade, bats continue to be among the most abundant 
sources of novel viral sequences [7]. Bat species are 
among the oldest mammals and represent 20% of mammalian 
diversity [8]; they occupy diverse niches, 
from isolated individuals to large commensal colonies, 
with broad geographic ranges that can span thousands 
of miles. Importantly, their great diversity and long 
co-evolutionary relationships with pathogens provide the 
opportunity for cross-species mixing and maintenance 
of quasi-species pools of viruses that can infect a range of 
hosts [9,10]. Yet, despite harboring such a diverse assortment 
of viruses, surveyed bats rarely exhibit signs of 
disease. Several hypotheses have been proposed to 
explain these asymptomatic infections. One postulates 
that bats, the only flying mammals, produce large amounts 
of reactive oxygen species (ROS) and, in response, have 
modulated genes to limit oxidative stress [11], which may 
result in reduced viral replication and pathogenesis [12]. 
Similarly, a modified innate immune response may also 
contribute to the diverse viral pools harbored by bats. 
Known PYHIN (PYRIN and HIN domain-containing) 
genes within the inflammasome pathway and natural 
killer immunoglobulin-like receptors (KIRs) are absent 
or significantly reduced in some surveyed bat species, 
potentially limiting disease and damage following infection 
[11,13]. In addition, constitutive expression of bat 
interferon subtypes likely limits disease but permits 
low-level viral infection to persist [14]. A third possibility 
suggests a commensal relationship between the 
harbored viruses and bat species [15]. As primarily identified 
from enteric samples (i.e., bat guano), these pools of 
viruses may serve a critical role in the bat microbiome to 
prime immunity, a concept similarly proposed for humans 
with herpesviruses [16]. Finally, enteric infection involves 
a tissue significantly different from the respiratory 
tract in terms of disease and adaptive immunity; thus, 
virus tropism differences between species and tissues 
may also contribute to limiting disease in bats. Similarly, 
while recent work has shown intact elements of adaptive 
immunity in bat species [17-19], the enteric location may 
generate a dampened adaptive response that permits viral 
maintenance similar to the members of the microbiome in 
humans [20]. Together, these factors likely work in 
combination and indicate how diverse pools of CoV 
quasi-species can survive in bat populations. 

While bat species maintain factors that permit virus 
persistence, the unique host environment also promotes 
broad diversity in CoV quasi-species pools. As a result of 
flight, accumulation of ROS may occur for short 
periods of time and has been shown to have mutagenic 
effects, potentially overwhelming CoV proofreading 
repair and/or altering viral polymerase fidelity and 
increasing species diversity, a possible key to cross-species 
transmission [21]. Similarly, the constitutive expression 
of type I IFN in bat hosts may select for advantageous 
viral mutations that enhance resistance to innate 
immune antiviral defense pathways and provide a replication 
advantage, especially after cross-species transmission 
[14]. Conversely, the absence of key inflammatory 
mediators in bat species provides no selective pressure to 
minimize these responses [13]; subsequently, infection of 
a new host could result in massive and pathogenic inflammatory 
responses, as seen with both SARS-CoV and 
MERS-CoV infections in humans [22,23]. Overall, the 
unique aspects that permit quasi-species pools of viruses 
in bats also contribute to their diversity and potential to 
emerge in new species. 

Balancing act: honing CoV survival and 
emergence 

While bats provide a critical foreground, emergence of 
CoVs requires that key viral factors be altered to 
overcome species barriers without sacrificing the form 
or function of other important elements. This dichotomy 
in CoVs is governed by two distinct mechanisms: 
fidelity and gene acquisition (Figure 1). A major limitation 
to RNA virus capacity is the need to minimize 
sequence length to survive error catastrophe [24]. However, 
CoVs, as some of the largest members of the 
Nidovirales order, have overcome this barrier by producing 
a large replication complex with known RNA 
synthesis and modification activities that include a 
proofreading machine, mediated primarily via the 3'-5' 
exoribonuclease activity of non-structural protein 
(nsp) 14 [25]. As such, this large and complex RNA 
replication machinery has allowed CoVs to achieve 
upwards of 32 kb in size while maintaining the functional 
components required for viability. Coupled with 
robust fidelity, CoVs have also used recombination, 
horizontal gene transfer, gene duplication, and alternative 
open reading frames to expand their functional 
capacity in current and new hosts [26]. Together, 
both fidelity and gene acquisition have honed and 
refined CoV proteins, which can be divided into three 
broad groups based on selective pressure: spike, conserved, 
and variable proteins (Figure 1). For a novel 
CoV to emerge, these three groups must function in 
harmony, providing sufficient changes to overcome species 
barriers while maintaining key viral functions. 
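
The length constraint imposed by error catastrophe can be illustrated with a short calculation. The sketch below assumes independent errors at a uniform per-nucleotide rate, a textbook simplification that is not taken from this review; the function names and both error rates are hypothetical placeholders rather than measured CoV values. Under that assumption, the expected number of errors per genome copy grows linearly with genome length, so a lower per-site error rate (as a proofreading activity would provide) is what keeps most copies of a ~32 kb genome error-free.

```python
def expected_errors(genome_length: int, per_site_error_rate: float) -> float:
    """Mean number of errors per genome copy under a uniform per-site rate."""
    return genome_length * per_site_error_rate


def error_free_fraction(genome_length: int, per_site_error_rate: float) -> float:
    """Probability that one genome copy carries no errors at all,
    assuming errors at each site are independent (a simplification)."""
    return (1.0 - per_site_error_rate) ** genome_length


# ~32 kb is the upper CoV genome size mentioned in the text.
GENOME_LENGTH = 32_000

# Hypothetical, illustrative per-site error rates (NOT measured values).
ILLUSTRATIVE_RATES = {
    "without proofreading (assumed)": 1e-4,
    "with proofreading (assumed)": 1e-6,
}

if __name__ == "__main__":
    for label, rate in ILLUSTRATIVE_RATES.items():
        mean_errors = expected_errors(GENOME_LENGTH, rate)
        intact = error_free_fraction(GENOME_LENGTH, rate)
        print(f"{label}: {mean_errors:.3f} errors/copy, "
              f"{intact:.1%} error-free copies")
```

The two rates differ by a factor of 100 only to make the contrast visible; substituting experimentally measured rates would be needed before drawing any quantitative conclusion.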

Figure 1. Balancing coronavirus emergence. Bat populations maintain a unique environment that facilitates survival and maintenance of diverse pools of 
viruses. To overcome species barriers, CoVs must modify some key viral factors while maintaining others. Two mechanisms govern this balance: 
fidelity and gene modulation. Using these processes, CoVs shape their proteins, conserving some (viral enzymes, structural proteins, spike S2) 
while modifying others (non-structural proteins, accessory proteins, spike S1). The resulting pools therefore maintain viability while also possessing 
tools necessary for emergence. 

Keying in: spike drives emergence 

Charged with binding the host receptor, the spike protein 
of CoVs governs species specificity and is a critical target 
for host immunity [27]. Divided into two parts, the S1 
portion forms the globular head of the spike trimer 
(Figure 2a), drives receptor engagement, and is variable 
across and within CoV groups (Figure 2b) [28,29]. In 
contrast, the S2 domain maintains the entry machinery 
and requires more conservation across the CoV family 
(Figure 2a, b). With binding required for infection, mutations 
within S1, and most notably the receptor-binding 
domain (RBD), have been thought to be critical for CoV 
emergence [30]. Chimeric viruses employing civet, 
early-phase, and middle-phase spike proteins demonstrated 
viability for the closely related strains in human cells 
[31,32]. However, for some strains, such as SZ16 and 
bat-derived HKU3-CoV, the closest known SARS-CoV progenitors 
at the time, progeny virions were not recoverable 
in Vero or primary human airway epithelial cells, despite 
evidence of RNA replication [30,32]. To overcome this 
barrier, a single humanizing mutation, K479N, was introduced 
into SZ16, and a chimeric HKU3 virus containing 
the RBD of SARS-CoV was designed; both approaches permitted 
replication, likely due to the resulting capacity to bind the human 
ACE2 receptor [30,31]. A similar approach was used with 
group 2C CoV HKU5; substitution of the entire ectodomain 
from SARS-CoV spike resulted in an HKU5 virus 
that was able to infect human cells [33]. Together, the 
data argue that the ability of the spike to bind receptor is 
required for viability in novel hosts. 

Figure 2 


However, more recent advances identified bat CoV spike 
proteins that could produce robust infection without 
manipulation [34,35]. Building from sequences closely 
related to the epidemic SARS-CoV strains [36], chimeric 
viruses employing the spike sequences from SHC014 and 
WIV1 clusters produced CoVs capable of replicating in 
human cells and causing disease in vivo [34,35]. Coupled 
with the discovery of sequences even more closely related 
to the epidemic SARS-CoV strains and evidence of robust 
S1 recombination [37], the results suggest that extensive 
mutation of the spike RBD may not be the only correlate 
for infection of human hosts. Notably, both chimeric 
viruses were attenuated relative to the epidemic strain, 
suggesting that adaptation within the new host contributes 
to disease and pathogenesis [34,35]. Yet, it remains 
unclear if these mutations occur exclusively within the S1 
portion of spike or if subtle changes in the S2 region 
contribute to enhanced disease by interfacing with surface 
and intracellular proteases that function in entry and 
egress [38,39]. 

Figure 2. Conservation and modification of spike protein. The CoV spike protein is critical for receptor binding and entry. Therefore, while modification is likely 
required for infection of new species, the spike protein must also maintain its entry mechanism. (a) Structure of MHV-CoV spike trimer (adapted 
from Ref. [53]), dividing the protein into S1 globular head portions (blue) and S2 conserved stalk (green). (b) Heat maps were constructed from a 
set of representative coronaviruses from all four genogroups using alignment data paired with neighbor-joining phylogenetic trees built in Geneious 
(v.9.1.5) and visualized in EvolView (evolgenius.info). Trees show the degree of genetic similarity of S1 and S2 domains of the spike glycoprotein. 

Figure 3. Maintenance and change in the CoV backbone. Changes to the CoV backbone can aid emergence, but must be balanced against conservation of 
other elements. (a) Genomic structure of SARS-CoV with proteins predicted to be conserved (blue), variable (red), or in between (purple). (b) Heat 
maps were constructed from a set of representative coronaviruses from all four genogroups using alignment data paired with neighbor-joining 
phylogenetic trees built in Geneious (v.9.1.5) and visualized in EvolView (evolgenius.info). Trees show the degree of genetic similarity of ORF6, 
NSP2, nucleocapsid, and NSP14 across genera. 

Mainstays and accessories: adding tools but 
keeping a base 

The CoV spike protein captures a critical dichotomy 
necessary for emergence, employing enough novelty in 
its S1 region to bind new host receptors while conserving 
functional entry activity in its S2 portion. However, while 
critical for infection of new hosts, changing the spike 
protein alone is not sufficient to cause epidemic disease 
[34,35]; therefore, changes within the backbone are also 
necessary to speed emergence. Yet, the same dichotomy 
seen with the spike glycoproteins is necessary in balancing 
change within the CoV backbone. Certain elements, 
most notably accessory proteins, may be added or modified 
to enhance infection within new hosts. In contrast, 
other viral motifs and proteins must be conserved to 
maintain virus functionality. For each, CoV fidelity, 
recombination, and evolutionary pressure hone and refine 
these genes, providing a framework for emergence in a 
new species to occur. 

For highly conserved viral functions, the presence of CoV 
fidelity machinery provides an important means to 
maintain these activities in the context of an expansive 
genome. Broadly, these conserved viral proteins can be 
categorized into structural and enzymatically active 
groups (Figure 3a). For structural proteins, including 
the nucleocapsid (N), matrix (M), and envelope (E), high 
within-group conservation is maintained, with more modest 
similarity seen across the entire CoV family 
(Figure 3b). This level of conservation, similar to the 
S2 portion of spike, suggests the need to maintain functional 
interaction for the formation of viral particles. 
Similarly, genes within the ORF1ab polyprotein show a distinction, 
with genes involved in protease cleavage and the replication 
complex having high levels of similarity across the CoV 
family. For example, enzymatically active proteins, such 
as nsp14 and nsp16, maintain very high conservation, 
likely due to their specific functions in proofreading 
and 2'-O methylation of nascent RNA [25,40] (Figure 3). 
For both groups, some mutational space is available, 
accounting for differences across the family; however, 
function must be maintained to ensure CoV survival. 

In contrast, accessory proteins distinguish CoV infections 
from each other, with high variability across the family, 
allowing viruses to adapt to current and novel hosts. The 
majority of these genes have been characterized in the 
context of antagonizing host immune responses, most 
notably type I IFN pathways [41]. However, the functions 
of these proteins may extend beyond host immunity 
and may be species-specific. For example, the SARS-CoV 
accessory protein ORF6 was initially characterized based 
on its capacity to interfere with STAT1 nuclear localization 
[42]. Further study indicated that modulation of the 
IFN responses was a byproduct of karyopherin transport 
and had a significant impact on host modulation beyond 
type I IFN at late times post-infection [42,43]. Notably, 
protein-coding sequences similar to SARS-CoV ORF6 are 
not readily detected beyond the group 2B CoV family, 
suggesting a more recent acquisition (Figure 3). Similarly, 
SARS-CoV ORF8 has undergone significant modification, with 
a 29-nucleotide deletion found in epidemic strains resulting 
in two novel proteins (ORF8a and 8b) [44]; coupled 
with reports of human isolates with larger deletions, these 
results suggest that the epidemic strain may be removing 
a protein only necessary for survival in bats [45]. Even for 
viral genes within the ORF1ab polyprotein, significant 
changes can be noted across viral families. Nsp2, cleaved 
co-translationally from nsp3 and present in some form in 
all CoVs, is responsible for a wide variety of activities and 
has minimal cross-genus sequence homology, although 
within groups, similarities are variable (Figure 3) [46-48]. 
Together, these results argue that across the CoV family, 
significant differences in accessory proteins can modulate 
and change infection aspects, including kinetics, severity, 
and species. 

Yet, even within more closely related subgroups, novel 
genes can appear from diverse sources and potentially 
fuel emergence. The recent discovery and characterization 
of two closely related SARS-like viruses, WIV1 and 
WIV16, revealed a novel accessory protein, ORFX, 
which was not found in the epidemic SARS-CoV strains 
[49]. Containing no sequence homology to any known 
proteins, the novel gene modulates type I IFN and 
activates NF-κB signaling pathways, suggesting a role 
in modulating host immunity. While the majority of 
accessory proteins are thought to be acquired from the 
host, recent work suggests that novel CoV proteins can 
even be taken from other pathogens [50]. Identification of 
a novel coronavirus (Ro-BatCoV GCCDC1) also revealed 
the presence of a unique 3' protein with homology to a 
known reovirus gene; a similar finding with the 
hemagglutinin-esterase in a subset of CoVs further suggests the 
possibility of recombination events occurring between 
viral families [8,51]. Together, the results indicate that 
CoVs can sample, acquire, and maintain a range of diverse 
proteins that may be critical for maintenance in natural 
hosts and emergence in new species. 

Conclusion 

With permissive natural hosts and inherent tools to balance 
gene modulation/maintenance, CoVs are uniquely 
positioned to emerge in novel hosts. For both the epidemic 
strains (SARS-CoV and MERS-CoV) and contemporary 
human strains (HCoV 229E, NL63, OC43), significant 
human disease may be the outcome of cross-species 
transmission. Importantly, opportunities exist to utilize 
metagenomics data to prepare for and possibly mitigate 
future emergence events. In seeking these goals, 
researchers need to consider the factors that drive emergence. 
In determinations of potential threats, exploring 
the variable spike S1 portion of bat CoVs to identify 
viruses capable of binding to human receptors is key. 
Similarly, targeting highly conserved genes like the S2 
region of spike has allowed for the development of 
therapeutics with broad efficacy against current CoVs and 
those that may emerge in the future [28,52]. In addition, 
understanding the mechanisms and impact of highly 
variable genes provides another metric for threat and 
identifies targets for the generation of attenuated vaccine 
strains. Together, these approaches provide a platform to 
leverage our understanding of how CoVs emerge from bat 
sources to prepare for and potentially stem future disease 
outbreaks. With globalization, habitat loss in developing 
nations, and uneven public health infrastructures, the 
survival and amplification of novel CoVs in bat populations 
is now a lurking threat that requires immediate 
attention and preparation. 